{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Structural Decomposition Analysis\n",
    "\n",
    "Author: <a href=\"mailto:a.owen@leeds.ac.uk\"> Dr Anne Owen </a> \n",
    "\n",
    "Before we start, paste the following code into the box below:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import copy\n",
    "import matplotlib\n",
    "pd.options.display.precision = 2\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this tutorial we will continue to use the World Input-Output Database (WIOD), but this time we will use the data tables in constant 2000 prices. \n",
    "\n",
    "\n",
    "WIOD contains 30 countries and the Rest of the World, but this week we are going to focus solely on the UK for the period 2000-2014. \n",
    "\n",
    "Recall that data for 56 sectors are classified according to the International Standard Industrial Classification revision 4 (ISIC Rev. 4).\n",
    "\n",
    "We are particularly interested in the change in footprint from 2000 and will initially need a function that simply calculates the total footprint.\n",
    "\n",
    "## Exercise 1.1 Make total footprint function\n",
    "\n",
    "Copy this function into the box below:\n",
    "\n",
    "```python\n",
    "def total_footprint_calc(Z,Y,y_region,f):   \n",
    "    x = np.sum(Z,1) + np.sum(Y,1)\n",
    "    x[x==0] = 0.000000001\n",
    "    big_X = np.tile(np.transpose(x),[30*56,1])\n",
    "    A = Z/big_X \n",
    "    L = np.linalg.inv(np.identity(30*56)-A)\n",
    "    e = f/x \n",
    "    eL = np.dot(e,L)\n",
    "    footprint = np.dot(eL,y_region)\n",
    "\n",
    "    return footprint\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
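  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what each step of ```total_footprint_calc``` does, it can help to run the same calculation on a tiny made-up economy. The sketch below is only an illustration: it assumes a single region with two sectors, all the numbers are invented, and ```toy_footprint``` is a hypothetical helper rather than part of the tutorial code.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def toy_footprint(Z, Y, y_region, f):\n",
    "    # Total output: intermediate sales plus final demand sales (row sums)\n",
    "    x = Z.sum(axis=1) + Y.sum(axis=1)\n",
    "    # Technical coefficients: column j of Z divided by the output of sector j\n",
    "    A = Z / x\n",
    "    # Leontief inverse\n",
    "    L = np.linalg.inv(np.identity(len(x)) - A)\n",
    "    # Direct emissions per unit of output\n",
    "    e = f / x\n",
    "    return e @ L @ y_region\n",
    "\n",
    "# Invented two-sector data (not WIOD values)\n",
    "Z = np.array([[10.0, 20.0], [30.0, 5.0]])\n",
    "Y = np.array([[70.0], [65.0]])\n",
    "f = np.array([50.0, 20.0])\n",
    "\n",
    "footprint = toy_footprint(Z, Y, Y[:, 0], f)\n",
    "print(footprint)\n",
    "```\n",
    "\n",
    "Because this toy economy has only one region consuming all final demand, the footprint should equal total emissions, ```f.sum()```, which is a handy sanity check."
   ]
  },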
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 1.2: Load the data\n",
    "\n",
    "Before we can load the data we must set up some empty dictionaries. Write:\n",
    "\n",
    "```python\n",
    "Z = {}\n",
    "Y = {}\n",
    "f = {}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each individual year of data for each of ```Z```, ```Y``` and ```f``` is contained in a CSV file.\n",
    "\n",
    "The country name and sector names are found in the first column and row of the ```Z``` files.\n",
    "\n",
    "The country names are found in the first row of the ```Y``` files and the country name and sector names are found in the first column of the ```Y``` files.\n",
    "\n",
    "The country name and sector names are found in the first row of the ```f``` files.\n",
    "\n",
    "Use the following code, similar to the last set of exercises:\n",
    "\n",
    "```python\n",
    "for yr in range (2000,2015):\n",
    "    print(yr)\n",
    "    Z[yr] = pd.read_csv('WIODv2016_data_deflated/z_' +str(yr)+ '_2000prices.csv', header=0,index_col=0)\n",
    "    Y[yr] = pd.read_csv('WIODv2016_data_deflated/y_' +str(yr)+ '_2000prices.csv', header=0,index_col=0)\n",
    "    f[yr] = np.transpose(pd.read_csv('WIODv2016_data_deflated/f_' +str(yr)+ '.csv', header=0,index_col=0))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
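  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ```header=0``` and ```index_col=0``` arguments tell pandas to use the first row as column headings and the first column as row headings. A minimal sketch of this, using a small in-memory table instead of the WIOD files (the labels and values are invented for illustration):\n",
    "\n",
    "```python\n",
    "import io\n",
    "import pandas as pd\n",
    "\n",
    "# A made-up 2x2 table standing in for one of the CSV files\n",
    "csv_text = '''label,GBR_A,GBR_B\n",
    "GBR_A,1.0,2.0\n",
    "GBR_B,3.0,4.0\n",
    "'''\n",
    "\n",
    "# First row becomes the column headings, first column becomes the row headings\n",
    "toy_Z = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)\n",
    "print(toy_Z)\n",
    "\n",
    "# Values can then be looked up by row and column label\n",
    "print(toy_Z.loc['GBR_A', 'GBR_B'])\n",
    "```"
   ]
  },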
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 1.3 Using a for-loop to find the UK footprint for the years 2000 to 2014\n",
    "\n",
    "First let's create a variable ```years``` that contains the range 2000-2014:\n",
    "\n",
    "```python\n",
    "years = range(2000,2015)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we will set up a new empty variable called ```data``` with one row per year from 2000 to 2014. The first column will hold the carbon footprint and the second column will show the change over time from 2000. \n",
    "\n",
    "Try:\n",
    "\n",
    "```python\n",
    "data = np.zeros((15,2))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we are going to set up a for-loop to call the footprint function for every year in ```years```. \n",
    "\n",
    "We need to set the ```y_region``` variable to use the column for the UK. We do this using ```Y[yr].loc[:,'GBR']```, which selects the column from ```Y[yr]``` that has 'GBR' as its column heading.\n",
    "\n",
    "We will put the result into our ```data``` variable. The problem is that we need to tell Python which cell to put the result in. This is where the ```enumerate``` function becomes really useful. It counts how many times we have been through the loop, and we can use that count to decide which cell to fill.\n",
    "\n",
    "Finally, we calculate the change from 2000 by looking up the current year's footprint, ```data[count,0]```, and subtracting the footprint recorded for 2000, found in the first row, ```data[0,0]```.\n",
    "\n",
    "Try:\n",
    "\n",
    "```python\n",
    "for count, yr in enumerate(years):\n",
    "    print(yr)\n",
    "    data[count,0] = total_footprint_calc(Z[yr],Y[yr],Y[yr].loc[:,'GBR'],f[yr])\n",
    "    data[count,1] = data[count,0]-data[0,0]\n",
    "        \n",
    "UK_foot = pd.DataFrame(data,index=years,columns=['footprint','change from 2000'])\n",
    "UK_foot\n",
    "```\n",
    "\n",
    "The last two lines of code make and display a nice dataframe with headings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
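  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the role of ```enumerate``` is still unclear, the following stand-alone sketch shows the same fill-a-table pattern on three invented footprint values (the dictionary ```made_up``` is hypothetical and simply stands in for the call to the footprint function):\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "years = range(2000, 2003)\n",
    "# Invented footprints, one per year\n",
    "made_up = {2000: 10.0, 2001: 12.5, 2002: 11.0}\n",
    "\n",
    "data = np.zeros((len(years), 2))\n",
    "for count, yr in enumerate(years):\n",
    "    # count is the row number (0, 1, 2); yr is the year itself\n",
    "    data[count, 0] = made_up[yr]\n",
    "    # change relative to the first row, i.e. the first year\n",
    "    data[count, 1] = data[count, 0] - data[0, 0]\n",
    "\n",
    "print(data)\n",
    "```\n",
    "\n",
    "The first row of the second column is always zero, because the first year is compared with itself."
   ]
  },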
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's make a graph of the change in footprint from 2000.\n",
    "\n",
    "Try:\n",
    "```python\n",
    "change = UK_foot.iloc[:,1]\n",
    "chart = change.plot( kind = 'line')\n",
    "chart.set_ylabel('change from 2000 (kilotonnes CO2)')\n",
    "```\n",
    "\n",
    "The numbers are ever-so slightly different from those calculated last week because we are using the constant price database and the allocations of global emissions shift a tiny bit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 2.1 Checking the Structural Decomposition Equation\n",
    "\n",
    "For the decomposition work, because we want our decomposition to align with IPAT/Kaya principles, we have rewritten the footprint equation from:\n",
    "\n",
    "$$\n",
    "\\mathbf{Q = eLy}\n",
    "$$\n",
    "\n",
    "to\n",
    "\n",
    "$$\n",
    "\\mathbf{Q=isp}\n",
    "$$\n",
    "\n",
    "where $\\mathbf{i = eL}$, $\\mathbf{s = \\frac{y}{p}}$ and $\\mathbf{p}$ is the population.\n",
    "\n",
    "Before we start calculating decompositions, let's check that this new equation works.\n",
    "\n",
    "First you will need a vector of UK population by year.\n",
    "\n",
    "```python\n",
    "pop_data = [58893000,59120000,59370000,59648000,59999000,60401000,60847000,61322000,61807000,62276000,62766000,63259000,63700000,64128000,64602000]\n",
    "\n",
    "p = pd.DataFrame(pop_data,index=years)\n",
    "display(p)\n",
    "```\n",
    "\n",
    "These population figures have been taken from the World Bank"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Now you need to make $\\mathbf{i=eL}$\n",
    "\n",
    "We can take code from the footprint function at the start of this workbook.\n",
    "\n",
    "First we set up a data dictionary for i using ```i={}```.\n",
    "\n",
    "The final part of the loop places the new dataframe ```i_data```, which is made by multiplying ```e``` by ```L```, into the dictionary for each year.\n",
    "\n",
    "```python\n",
    "i = {}\n",
    "for yr in years:\n",
    "    x = np.sum(Z[yr],1) + np.sum(Y[yr],1)\n",
    "    x[x==0] = 0.000000001\n",
    "    big_X = np.tile(np.transpose(x),[30*56,1])\n",
    "    A = Z[yr]/big_X \n",
    "    L = np.linalg.inv(np.identity(30*56)-A)\n",
    "    e = f[yr]/x \n",
    "    i_data = pd.DataFrame(np.dot(e,L),columns=Y[2000].index)\n",
    "    i[yr] = i_data\n",
    "    \n",
    "display(i[2000])\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember, $\\mathbf{i=eL}$ is a row vector showing full supply chain product emissions per \\$ of spend.\n",
    "\n",
    "Now we make $\\mathbf{s=\\frac{y}{p}}$. This is spend on products per capita.\n",
    "\n",
    "Try:\n",
    "\n",
    "```python\n",
    "s = {}\n",
    "for yr in years:\n",
    "    s_data = pd.DataFrame(Y[yr].loc[:,'GBR'].values/p.loc[yr].values,index = Y[2000].index)\n",
    "    s[yr] = s_data\n",
    "display(s[2000])\n",
    "```\n",
    "\n",
    "This is quite straightforward. The only complication here is getting the correct values from the population dataframe ```p```. Remember, to select values from a dataframe that correspond to a row heading, we use ```loc```."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember, $\\mathbf{s=\\frac{y}{p}}$ is a column vector showing spend per product per capita. The units in this case are \\$million per person.\n",
    "\n",
    "\n",
    "We are now going to multiply everything together: $\\mathbf{Q=isp}$\n",
    "\n",
    "Because ```i``` and ```s``` are vectors, we will be using matrix functions. A row vector multiplied by a column vector should give a single value. Population is a scalar (a single number), so it is simply multiplied at the end. \n",
    "\n",
    "We'll put our new data into a DataFrame ```UK_foot2``` and check that it is the same as ```UK_foot```, calculated earlier.\n",
    "\n",
    "```python\n",
    "check_data = np.zeros((15,2))\n",
    "\n",
    "for count, yr in enumerate(years):\n",
    "    check_data[count,0] = np.dot(i[yr],s[yr])*p.loc[yr].values\n",
    "    check_data[count,1] = check_data[count,0]-check_data[0,0]\n",
    "\n",
    "\n",
    "UK_foot2 = pd.DataFrame(check_data,index=years,columns=['footprint','change from 2000'])\n",
    "UK_foot2\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
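  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The reason the two calculations agree is that dividing by population and then multiplying by it again cancels out. A minimal sketch with invented numbers (the names ```i_toy```, ```y_toy``` and ```p_toy``` are hypothetical and are not WIOD data):\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "i_toy = np.array([[0.5, 0.2, 0.1]])          # row vector, shape (1, 3)\n",
    "y_toy = np.array([[100.0], [50.0], [20.0]])  # column vector, shape (3, 1)\n",
    "p_toy = 10.0                                 # population as a scalar\n",
    "\n",
    "s_toy = y_toy / p_toy                        # spend per product per capita\n",
    "\n",
    "Q_direct = (i_toy @ y_toy).item()            # footprint as i times y\n",
    "Q_isp = (i_toy @ s_toy).item() * p_toy       # footprint as i times s times p\n",
    "\n",
    "print(Q_direct, Q_isp, np.isclose(Q_direct, Q_isp))\n",
    "```\n",
    "\n",
    "In the notebook itself, something like ```np.allclose(UK_foot['footprint'], UK_foot2['footprint'])``` should return ```True``` if the two tables agree."
   ]
  },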
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Has it worked? \n",
    "\n",
    "Have we successfully calculated the footprint using variables that resemble IPAT/Kaya?\n",
    "\n",
    "Now let's use Structural Decomposition Analysis to find out the effect of change in product emissions intensities, changes in spend per person and population change on the change in the UK's footprint!\n",
    "\n",
    "## Exercise 2.2 Making the Structural Decomposition Equations\n",
    "\n",
    "First of all, let's consider two years 2000 and 2007.\n",
    "\n",
    "We'll find out how each variable contributed to the increase in emissions between 2000 and 2007.\n",
    "\n",
    "Let $t_0$ = 2000 and $t_1$ = 2007.\n",
    "\n",
    "We need to make $\\mathbf{i_0}$, $\\mathbf{i_1}$, $\\mathbf{\\Delta i}$, $\\mathbf{s_0}$, $\\mathbf{s_1}$, $\\mathbf{\\Delta s}$, $\\mathbf{p_0}$, $\\mathbf{p_1}$, $\\mathbf{\\Delta p}$\n",
    "\n",
    "Try\n",
    "\n",
    "```python\n",
    "i_0 = i[2000]\n",
    "i_1 = i[2007]\n",
    "D_i = i_1-i_0\n",
    "\n",
    "s_0 = s[2000]\n",
    "s_1 = s[2007]\n",
    "D_s = s_1-s_0\n",
    "\n",
    "p_0 = p.loc[2000]\n",
    "p_1 = p.loc[2007]\n",
    "D_p = p_1-p_0\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we need to make $\\mathbf{i_{effect}}$, $\\mathbf{s_{effect}}$ and $\\mathbf{p_{effect}}$\n",
    "\n",
    "Remember \n",
    "\n",
    "$\\mathbf{i_{effect}} = \\frac{1}{2}(\\mathbf{\\Delta i}\\mathbf{s_0}\\mathbf{p_0}+\\mathbf{\\Delta i}\\mathbf{s_1}\\mathbf{p_1})$\n",
    "\n",
    "$\\mathbf{s_{effect}} = \\frac{1}{2}(\\mathbf{i_0}\\mathbf{\\Delta{s}}\\mathbf{p_1}+\\mathbf{i_1}\\mathbf{\\Delta{s}}\\mathbf{p_0})$\n",
    "\n",
    "$\\mathbf{p_{effect}} = \\frac{1}{2}(\\mathbf{i_1}\\mathbf{s_1}\\mathbf{\\Delta{p}}+\\mathbf{i_0}\\mathbf{s_0}\\mathbf{\\Delta{p}})$\n",
    "\n",
    "Here is the code for ```i_effect``` and ```p_effect```\n",
    "\n",
    "Can you run these and also generate ```s_effect```?\n",
    "\n",
    "```python\n",
    "\n",
    "i_effect = 0.5*((np.dot(D_i,s_0)*p_0.values)+(np.dot(D_i,s_1)*p_1.values))\n",
    "\n",
    "\n",
    "p_effect = 0.5*((np.dot(i_1,s_1)*D_p.values)+(np.dot(i_0,s_0)*D_p.values))\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
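  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three effects can be recovered by averaging two ways of splitting the total change. This is a sketch of the reasoning, assuming the two 'polar' decompositions are the ones being averaged; it uses only the symbols defined above, plus $\\mathbf{\\Delta Q}$ for the total change in footprint.\n",
    "\n",
    "$$\n",
    "\\mathbf{\\Delta Q} = \\mathbf{i_1}\\mathbf{s_1}\\mathbf{p_1} - \\mathbf{i_0}\\mathbf{s_0}\\mathbf{p_0}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\mathbf{\\Delta Q} = \\mathbf{\\Delta i}\\mathbf{s_1}\\mathbf{p_1} + \\mathbf{i_0}\\mathbf{\\Delta s}\\mathbf{p_1} + \\mathbf{i_0}\\mathbf{s_0}\\mathbf{\\Delta p}\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\mathbf{\\Delta Q} = \\mathbf{\\Delta i}\\mathbf{s_0}\\mathbf{p_0} + \\mathbf{i_1}\\mathbf{\\Delta s}\\mathbf{p_0} + \\mathbf{i_1}\\mathbf{s_1}\\mathbf{\\Delta p}\n",
    "$$\n",
    "\n",
    "Both of the last two lines expand to the same total change. Taking the average of the two and collecting the terms that contain $\\mathbf{\\Delta i}$, $\\mathbf{\\Delta s}$ and $\\mathbf{\\Delta p}$ gives $\\mathbf{i_{effect}}$, $\\mathbf{s_{effect}}$ and $\\mathbf{p_{effect}}$, so the three effects add up to $\\mathbf{\\Delta Q}$ exactly."
   ]
  },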
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try:\n",
    "\n",
    "```python\n",
    "print('The effect of a change in product intensities is ', int(i_effect[0,0]), ' kilotonnes CO2') \n",
    "\n",
    "print('The effect of a change in spend on products per person is ', int(s_effect[0,0]), ' kilotonnes CO2') \n",
    "\n",
    "print('The effect of a change in population is ', int(p_effect[0,0]), ' kilotonnes CO2')\n",
    "\n",
    "total = i_effect + s_effect + p_effect\n",
    "\n",
    "print('The total change  is ', int(total[0,0]), ' kilotonnes CO2')\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should have found that the change in product emissions intensities reduced the UK footprint by 62,442 Ktonnes CO2, the change in spend per person increased it by 133,725 Ktonnes CO2, and population growth increased it by 26,355 Ktonnes CO2.\n",
    "\n",
    "\n",
    "The overall effect is an increase of 97,638 Ktonnes CO2 from 2000. \n",
    "\n",
    "This should match the value in your table."
   ]
  },
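  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The claim that the three effects add up to the total change can be tested on a small invented example before trusting it on the real data. The sketch below assumes the same three-factor equations as above; ```sda_three_factor``` is a hypothetical helper and every number is made up.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def sda_three_factor(i_0, i_1, s_0, s_1, p_0, p_1):\n",
    "    # Changes in each factor between the two years\n",
    "    d_i, d_s, d_p = i_1 - i_0, s_1 - s_0, p_1 - p_0\n",
    "    # Average of the two polar decompositions\n",
    "    i_eff = 0.5 * ((d_i @ s_0) * p_0 + (d_i @ s_1) * p_1)\n",
    "    s_eff = 0.5 * ((i_0 @ d_s) * p_1 + (i_1 @ d_s) * p_0)\n",
    "    p_eff = 0.5 * ((i_1 @ s_1) * d_p + (i_0 @ s_0) * d_p)\n",
    "    return i_eff, s_eff, p_eff\n",
    "\n",
    "# Invented two-product data for two years\n",
    "i_0 = np.array([0.5, 0.2])\n",
    "i_1 = np.array([0.4, 0.1])\n",
    "s_0 = np.array([2.0, 3.0])\n",
    "s_1 = np.array([3.0, 3.5])\n",
    "p_0, p_1 = 10.0, 12.0\n",
    "\n",
    "effects = sda_three_factor(i_0, i_1, s_0, s_1, p_0, p_1)\n",
    "total_change = (i_1 @ s_1) * p_1 - (i_0 @ s_0) * p_0\n",
    "\n",
    "print(effects)\n",
    "print(sum(effects), total_change)\n",
    "```\n",
    "\n",
    "The two numbers on the last line should agree up to rounding error. If they do not, the most likely cause is a mismatch between the year subscripts in one of the three effect equations."
   ]
  },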
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 2.3 What about the recession?\n",
    "\n",
    "Can you repeat what you did in the previous exercise but now  $t_0$=2007 and $t_1$ = 2014?\n",
    "\n",
    "This will look at how each variable contributed to the decrease in emissions between 2007 and 2014.\n",
    "\n",
    "Start by making $\\mathbf{i_0}$, $\\mathbf{i_1}$, $\\mathbf{\\Delta i}$, $\\mathbf{s_0}$, $\\mathbf{s_1}$, $\\mathbf{\\Delta s}$, $\\mathbf{p_0}$, $\\mathbf{p_1}$, $\\mathbf{\\Delta p}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we need to make $\\mathbf{i_{effect}}$, $\\mathbf{s_{effect}}$ and $\\mathbf{p_{effect}}$ and display the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should have found that the change in product emissions intensities decreased the UK footprint by 86,942 Ktonnes CO2, the change in spend per person decreased it by 78,388 Ktonnes CO2, and population growth increased it by 33,227 Ktonnes CO2.\n",
    "\n",
    "The overall effect is a decrease of 132,104 Ktonnes CO2 between 2007 and 2014. \n",
    "\n",
    "This should match the value in your table when you subtract emissions in 2007 from emissions in 2014.\n",
    "\n",
    "Try\n",
    "\n",
    "```python\n",
    "UK_foot2.loc[2014,'footprint']-UK_foot2.loc[2007,'footprint']\n",
    "```\n",
    "\n",
    "to check"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 3.1 Putting it all together in a for loop\n",
    "\n",
    "We are now going to put all of this together in a for loop which calculates the change from 2000 for each year.\n",
    "\n",
    "\n",
    "Try:\n",
    "\n",
    "```python\n",
    "sda_data = np.zeros((15,4))\n",
    "\n",
    "for count, yr in enumerate(years):\n",
    "    i_0 = i[2000]\n",
    "    i_1 = i[yr]\n",
    "    delta_i = i_1-i_0\n",
    "    s_0 = s[2000]\n",
    "    s_1 = s[yr]\n",
    "    delta_s = s_1-s_0\n",
    "    p_0 = p.loc[2000]\n",
    "    p_1 = p.loc[yr]\n",
    "    delta_p = p_1-p_0\n",
    "    \n",
    "    i_effect = 0.5*((np.dot(delta_i,s_0)*p_0.values)+(np.dot(delta_i,s_1)*p_1.values))\n",
    "    s_effect = 0.5*((np.dot(i_0,delta_s)*p_1.values)+(np.dot(i_1,delta_s)*p_0.values))\n",
    "    p_effect = 0.5*((np.dot(i_1,s_1)*delta_p.values)+(np.dot(i_0,s_0)*delta_p.values))\n",
    "    \n",
    "    sda_data[count,0] = i_effect[0,0]\n",
    "    sda_data[count,1] = s_effect[0,0]\n",
    "    sda_data[count,2] = p_effect[0,0]\n",
    "    sda_data[count,3] = sda_data[count,0]+sda_data[count,1]+sda_data[count,2]\n",
    "    \n",
    "UK_SDA = pd.DataFrame(sda_data,index=years,columns=['Product emissions','Spend per capita','Population','Total change'])\n",
    "UK_SDA\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Can you make the following Chart?\n",
    "\n",
    "<img width=\"483\" alt=\"Screenshot 2022-01-28 at 14 35 01\" src=\"https://github.com/earao/images/blob/main/Screenshot%202023-12-22%20at%2015.53.07.png?raw=true\" />"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bonus Exercise 4.1 Can you try this for a different country?\n",
    "\n",
    "You'll need to find population data for your country of choice.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bonus Exercise 4.2 Can you try alternative decompositions?\n",
    "\n",
    "You might want to try:\n",
    "\n",
    "$$Q = eLsp$$\n",
    "\n",
    "Alternatively, see what happens if you split spend per product per person into two further variables, so that $s$ is the basket of goods (the share of spend on each product) multiplied by total per capita spend.\n",
    "\n",
    "This is starting to get really tricky. You might find the matrix multiplication causes headaches. This is why we always set up a checking table first.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
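  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One possible way to write the split of $s$ suggested above, as a sketch only (the symbols $c$ and $\\mathbf{b}$ are not used elsewhere in this tutorial and are introduced here purely for illustration):\n",
    "\n",
    "$$\n",
    "c = \\sum_k s_k, \\qquad \\mathbf{b} = \\frac{\\mathbf{s}}{c}, \\qquad \\mathbf{Q = ibcp}\n",
    "$$\n",
    "\n",
    "Here $c$ is total spend per capita and $\\mathbf{b}$ is the basket of goods, whose elements sum to 1. Since $\\mathbf{b}c = \\mathbf{s}$, this gives the same footprint as $\\mathbf{Q=isp}$, which is what a checking table should confirm before any decomposition is attempted."
   ]
  },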
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Key learning points\n",
    "\n",
    "You should have learnt:\n",
    "<ol>\n",
    "<li>How to rewrite the Leontief equation into something more like IPAT/Kaya</li>\n",
    "<li>How to form the variables that can be used in a Structural Decomposition Analysis</li>\n",
    "<li>How to calculate an SDA</li>\n",
    "<li>How to interpret the results of an SDA</li>\n",
    "   \n",
    "</ol>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
